51 research outputs found

    Complex organizational structure of the genome revealed by genome-wide analysis of single and alternative promoters in Drosophila melanogaster

    Background: The promoter is a critical transcriptional cis-regulatory element. In addition to its role as an assembly site for the basal transcriptional apparatus, the promoter plays a key part in mediating temporal and spatial aspects of gene expression through differential binding of transcription factors and selective interaction with distal enhancers. Although many genes have multiple promoters, little attention has been focused on how these relate to one another; nor has much study been directed at relationships between promoters of adjacent genes. Results: We have undertaken a systematic investigation of Drosophila promoters. We divided promoters into three groups: unique promoters, first alternative promoters (the most 5' of a gene's multiple promoters), and downstream alternative promoters (the remaining alternative promoters 3' to the first). We observed distinct nucleotide distribution and sequence motif preferences among these three classes. We also investigated the promoters of neighboring genes and found that a greater than expected number of adjacent genes have similar sequence motif profiles, which may allow the genes to be regulated in a coordinated fashion. Consistent with this, there is a positive correlation between similar promoter motifs and related gene expression profiles for these genes. Conclusions: Our results suggest that different regulatory mechanisms may apply to each of the three promoter classes, and provide a mechanism for "gene expression neighborhoods," local clusters of co-expressed genes. As a whole, our data reveal an unexpected complexity of genomic organization at the promoter level with respect to both alternative and neighboring promoters.
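    The three-way grouping of promoters described in this abstract can be illustrated with a minimal sketch. It assumes each promoter is given as a hypothetical (gene_id, strand, tss_position) tuple; the function name, the input layout, and the label strings are illustrative assumptions, not the authors' pipeline.

```python
from collections import defaultdict


def classify_promoters(promoters):
    """Label each promoter as unique, first alternative, or downstream alternative.

    `promoters` is an iterable of (gene_id, strand, tss_position) tuples.
    This input layout is an assumption made for illustration only.
    """
    by_gene = defaultdict(list)
    for gene_id, strand, tss in promoters:
        by_gene[gene_id].append((strand, tss))

    labels = {}
    for gene_id, entries in by_gene.items():
        if len(entries) == 1:
            # A gene with a single promoter has a "unique" promoter.
            labels[(gene_id, entries[0][1])] = "unique"
            continue
        strand = entries[0][0]
        # The most 5' promoter has the smallest coordinate on the plus
        # strand and the largest coordinate on the minus strand.
        ordered = sorted((tss for _, tss in entries), reverse=(strand == "-"))
        labels[(gene_id, ordered[0])] = "first_alternative"
        for tss in ordered[1:]:
            labels[(gene_id, tss)] = "downstream_alternative"
    return labels
```

    Handling the strand explicitly matters because "most 5'" refers to the direction of transcription, not to the numeric genome coordinate.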

    Computational discovery of cis-regulatory modules in Drosophila without prior knowledge of motifs

    Prediction of cis-regulatory modules ab initio, without any input of relevant motifs, is achieved with two novel methods

    Large-scale analysis of transcriptional cis-regulatory modules reveals both common features and distinct subclasses

    Analysis of 280 experimentally verified cis-regulatory modules from Drosophila reveals features both common to all and unique to distinct subclasses of modules

    REDfly v3.0: toward a comprehensive database of transcriptional regulatory elements in Drosophila

    The REDfly database of Drosophila transcriptional cis-regulatory elements provides the broadest and most comprehensive available resource for experimentally validated cis-regulatory modules and transcription factor binding sites among the metazoa. The third major release of the database extends the utility of REDfly as a powerful tool for both computational and experimental studies of transcription regulation. REDfly v3.0 includes the introduction of new data classes to expand the types of regulatory elements annotated in the database along with a roughly 40% increase in the number of records. A completely redesigned interface improves access for casual and power users alike; among other features it now automatically provides graphical views of the genome, displays images of reporter gene expression and implements improved capabilities for database searching and results filtering. REDfly is freely accessible at http://redfly.ccr.buffalo.edu

    Genome-wide search identifies Ccnd2 as a direct transcriptional target of Elf5 in mouse mammary gland

    Background: The ETS transcription factor Elf5 (also known as ESE-2) is highly expressed in the mammary gland and plays an important role in its development and differentiation. Indeed, studies in mice have illustrated an essential role for Elf5 in directing alveologenesis during pregnancy. Although the molecular mechanisms that underlie the developmental block in Elf5-null mammary glands are beginning to be unraveled, this investigation has been hampered by limited information about the identity of Elf5 target genes. To address this shortcoming, in this study we have performed ChIP-cloning experiments to identify the specific genomic segments that are occupied by Elf5 in pregnant mouse mammary glands. Results: Sequencing and genomic localization of cis-regulatory regions bound by Elf5 in vivo have identified several potential target genes covering broad functional categories. A subset of these target genes demonstrates higher expression levels in Elf5-null mammary glands, suggesting a repressive functional role for this transcription factor. Here we focus on one putative target of Elf5, the Ccnd2 gene that appeared in our screen. We identify a novel Elf5-binding segment upstream of the Ccnd2 gene and demonstrate that Elf5 can transcriptionally repress Ccnd2 by directly binding to the proximal promoter region. Finally, using Elf5-null mammary epithelial cells and mammary glands, we show that loss of Elf5 in vivo leads to upregulation of Ccnd2 and an altered expression pattern in luminal cells. Conclusions: Identification of Elf5 targets is an essential first step in elucidating the transcriptional landscape that is shaped by this important regulator. Our studies offer a new toolbox for examining the biological role of Elf5 in mammary gland development and differentiation.

    An Integrated Strategy for Analyzing the Unique Developmental Programs of Different Myoblast Subtypes

    An important but largely unmet challenge in understanding the mechanisms that govern the formation of specific organs is to decipher the complex and dynamic genetic programs exhibited by the diversity of cell types within the tissue of interest. Here, we use an integrated genetic, genomic, and computational strategy to comprehensively determine the molecular identities of distinct myoblast subpopulations within the Drosophila embryonic mesoderm at the time that cell fates are initially specified. A compendium of gene expression profiles was generated for primary mesodermal cells purified by flow cytometry from appropriately staged wild-type embryos and from 12 genotypes in which myogenesis was selectively and predictably perturbed. A statistical meta-analysis of these pooled datasets (based on expected trends in gene expression and on the relative contribution of each genotype to the detection of known muscle genes) provisionally assigned hundreds of differentially expressed genes to particular myoblast subtypes. Whole embryo in situ hybridizations were then used to validate the majority of these predictions, thereby enabling true-positive detection rates to be estimated for the microarray data. This combined analysis reveals that myoblasts exhibit much greater gene expression heterogeneity and overall complexity than was previously appreciated. Moreover, it implicates large numbers of uncharacterized, differentially expressed genes in myogenic specification and subsequent morphogenesis. These findings also underscore a requirement for considerable regulatory specificity for generating diverse myoblast identities. Finally, to illustrate how the developmental functions of newly identified myoblast genes can be efficiently surveyed, a rapid RNA interference assay that can be scored in living embryos was developed and applied to selected genes. This integrated strategy for examining embryonic gene expression and function provides a substantially expanded framework for further studies of this model developmental system.

    PeakMatcher facilitates updated Aedes aegypti embryonic cis-regulatory element map

    Background: The Aedes aegypti mosquito is a threat to human health across the globe. The A. aegypti genome was recently re-sequenced and re-assembled. Due to a combination of long-read PacBio and Hi-C sequencing, the AaegL5 assembly is chromosome complete and significantly improves the assembly in key areas such as the M/m sex-determining locus. Release of the updated genome assembly has precipitated the need to reprocess historical functional genomic data sets, including cis-regulatory element (CRE) maps that had previously been generated for A. aegypti. Results: We re-processed and re-analyzed the A. aegypti whole embryo FAIRE seq data to create an updated embryonic CRE map for the AaegL5 genome. We validated that the new CRE map recapitulates key features of the original AaegL3 CRE map. Further, we built on the improved assembly in the M/m locus to analyze overlaps of open chromatin regions with genes. To support the validation, we created a new method (PeakMatcher) for matching peaks from the same experimental data set across genome assemblies. Conclusion: Use of PeakMatcher software, which is available publicly under an open-source license, facilitated the release of an updated and validated CRE map, which is available through the NIH GEO. These findings demonstrate that PeakMatcher software will be a useful resource for validation and transferring of previous annotations to updated genome assemblies

    Improved accuracy of supervised CRM discovery with interpolated Markov models and cross-species comparison

    Despite recent advances in experimental approaches for identifying transcriptional cis-regulatory modules (CRMs, ‘enhancers’), direct empirical discovery of CRMs for all genes in all cell types and environmental conditions is likely to remain an elusive goal. Effective methods for computational CRM discovery are thus a critically needed complement to empirical approaches. However, existing computational methods that search for clusters of putative binding sites are ineffective if the relevant TFs and/or their binding specificities are unknown. Here, we provide a significantly improved method for ‘motif-blind’ CRM discovery that does not depend on knowledge or accurate prediction of TF-binding motifs and is effective when limited knowledge of functional CRMs is available to ‘supervise’ the search. We propose a new statistical method, based on ‘Interpolated Markov Models’, for motif-blind, genome-wide CRM discovery. It captures the statistical profile of variable length words in known CRMs of a regulatory network and finds candidate CRMs that match this profile. The method also uses orthologs of the known CRMs from closely related genomes. We perform in silico evaluation of predicted CRMs by assessing whether their neighboring genes are enriched for the expected expression patterns. This assessment uses a novel statistical test that extends the widely used Hypergeometric test of gene set enrichment to account for variability in intergenic lengths. We find that the new CRM prediction method is superior to existing methods. Finally, we experimentally validate 12 new CRM predictions by examining their regulatory activity in vivo in Drosophila; 10 of the tested CRMs were found to be functional, while 6 of the top 7 predictions showed the expected activity patterns. We make our program available as downloadable source code, and as a plugin for a genome browser installed on our servers
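    For orientation, the widely used hypergeometric test of gene set enrichment mentioned in this abstract can be sketched as follows. This is the standard test only; it does not include the authors' extension for variability in intergenic lengths, and the function and parameter names are illustrative assumptions.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, sample, hits):
    """Upper-tail p-value P(X >= hits) of the standard hypergeometric test.

    population: total number of genes considered
    annotated:  genes in the population with the expected expression pattern
    sample:     genes neighboring the predicted CRMs
    hits:       genes in the sample that carry the annotation
    (All names are hypothetical; the paper's extended test is not shown.)
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of observing `hits` or more annotated genes.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

    A small p-value suggests that the genes near the predicted CRMs carry the expected annotation more often than a random draw of the same size would.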